A Robust Active Learning Framework Using Itemset Based Dynamic Rule Sampling
Authors
Abstract
Active learning is a rapidly growing field of machine learning that aims to reduce the labeling effort required of the oracle (a human expert) when acquiring informative training samples in domains where labeling is costly. Associative classification is a well-established prediction method that offers high accuracy and fast learning rates in classification. In this paper, we propose a novel algorithm that unifies associative classification with active learning. The algorithm has two major procedures: rule generation and rule pruning. It selects unlabeled instances from the pool of available samples and uses a unique dynamic rule sampling procedure to update the model. The rules are dynamically sampled class association rules (CARs), which are generated from the mined minimal infrequent itemsets. Results obtained with our approach on 10 datasets from the UCI-ML repository are compared with those of the ACTIVE-DECORATE algorithm. We also analyze our sampling method against state-of-the-art sampling frameworks and show that our method performs better.
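The abstract does not say how the minimal infrequent itemsets are mined, so the following is only an illustrative sketch of the standard definition: an itemset is minimal infrequent when its support falls below the threshold while every proper subset is frequent. The function names, the `max_size` cap, and the brute-force level-wise search are assumptions made for illustration, not the authors' procedure.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def minimal_infrequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force, level-wise search (hypothetical helper, not from the paper).

    A candidate of size k is tested only if all of its (k-1)-subsets are
    frequent; by anti-monotonicity of support, all of its proper subsets are
    then frequent, so an infrequent candidate is minimal infrequent.
    """
    items = sorted(set().union(*transactions))
    frequent = {frozenset()}  # the empty itemset is trivially frequent
    minimal_infrequent = []
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            # Skip candidates that have an infrequent (k-1)-subset.
            if any(candidate - {item} not in frequent for item in candidate):
                continue
            if support(candidate, transactions) >= min_support:
                frequent.add(candidate)
            else:
                minimal_infrequent.append(candidate)
    return minimal_infrequent


# Toy data: each transaction is a set of items.
transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
result = minimal_infrequent_itemsets(transactions, min_support=0.5)
```

With the toy data and a threshold of 0.5, every single item and the pairs {a, b} and {a, c} are frequent, so the only minimal infrequent itemset is {b, c}. In a class association rule setting, itemsets like this would serve as rule antecedents paired with a class label; how the paper samples and prunes those rules is not specified in the abstract.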
Similar resources
A DIC-based Distributed Algorithm for Frequent Itemset Generation
We present a distributed algorithm based on Dynamic Itemset Counting (DIC) for generating frequent itemsets. DIC represents a paradigm shift from Apriori-based algorithms in the number of passes over the database, hence reducing the total time taken to obtain the frequent itemsets. We exploit the advantage of Dynamic Itemset Counting in our algorithm, that of starting the counting of an i...
Review on Matrix Based Efficient Apriori Algorithm
The Apriori algorithm is one of the best-known and most widely used algorithms in the field of data mining. It is an association rule mining algorithm used to find frequent itemsets from the transactions in a database. The association rules are then generated from these frequent itemsets. The frequent itemset mining algorithms discover the frequent...
Itemset Size-Sensitive Interestingness Measures for Association Rule Mining and Link Prediction
Association rule learning is a data mining technique that can capture relationships between pairs of entities in different domains. The goal of this research is to discover factors from data that can improve the precision, recall, and accuracy of association rules found using interestingness measures and frequent itemset mining. Such factors can be calibrated using validation data and applied t...
Implementation of Efficient Algorithm for Mining High Utility Itemsets in Distributed and Dynamic Database
Association Rule Mining (ARM) finds the frequent itemsets or patterns among the existing items in a given database. High Utility Pattern Mining has become a recent research topic in data mining. The proposed work addresses High Utility Pattern mining for distributed and dynamic databases. The traditional method of mining frequent itemset mining embrace that the data is astride and sedent...
A Novel Artificial Immune System Approach to Robust Data Mining
We introduce several enhancements to deal with some of the weaknesses of previous artificial immune system models. Then, we present a framework for the accomplishment of several classical data mining tasks, such as frequent itemset discovery and robust clustering, based on ideas inspired from the natural immune system coupled with soft computing. For instance, we implement an artificial immune ...